feat: batch cold account loads in light client (#2341)
sergeytimoshin wants to merge 9 commits into main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable in the review settings.
📝 Walkthrough

Adds a const-generic AddressBatchSnapshot, replaces get_batch_data with get_batch_snapshot returning Result<Option<...>>, makes proof reconstruction HEIGHT-generic, propagates tree_next_insertion_index through indexer outputs, and tightens error propagation across forester, prover client, and SDK indexer code paths.
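The const-generic snapshot described in the walkthrough can be sketched roughly as follows. This is a minimal illustration, not the real type: the field names are taken from the sequence diagram below, and the actual definition in the PR may differ.

```rust
// Hypothetical sketch of the const-generic snapshot; field names follow
// the sequence diagram and the real definition in the PR may differ.
#[derive(Debug, Clone)]
pub struct AddressBatchSnapshot<const HEIGHT: usize> {
    pub addresses: Vec<[u8; 32]>,
    pub low_values: Vec<[u8; 32]>,
    pub indices: Vec<u64>,
    // One fixed-size merkle proof per address; the proof length is
    // pinned to HEIGHT at compile time instead of checked at runtime.
    pub proofs: Vec<[[u8; 32]; HEIGHT]>,
    pub leaves_hashchain: [u8; 32],
}

impl<const HEIGHT: usize> AddressBatchSnapshot<HEIGHT> {
    /// Number of addresses captured in this snapshot.
    pub fn len(&self) -> usize {
        self.addresses.len()
    }

    pub fn is_empty(&self) -> bool {
        self.addresses.is_empty()
    }
}
```

The point of the const generic is that a proof of the wrong height cannot even be constructed, which is what lets the downstream circuit-input code borrow fixed-size arrays.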
Sequence diagram:

```mermaid
sequenceDiagram
    participant Stream as StreamingQueue
    participant Forester as Forester::ProcessorV2
    participant Staging as StagingTree
    participant Prover as ProverClient
    Stream->>Forester: get_batch_snapshot<const HEIGHT>(start, end, hashchain_idx)
    Stream-->>Forester: AddressBatchSnapshot{addresses, low_values, indices, proofs, leaves_hashchain}
    Forester->>Staging: staging_tree.process_batch(snapshot, metadata)
    Staging-->>Forester: ProcessResult{proof_input, new_root}
    Forester->>Prover: prepare circuit inputs (borrowed slices & fixed-size proofs)
    Prover-->>Forester: circuit inputs / proof result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
sdk-libs/program-test/src/indexer/test_indexer.rs (1)
2498-2534: ⚠️ Potential issue | 🔴 Critical — Non-2xx prover responses can hang this loop indefinitely.

When `send()` succeeds but the prover returns a 4xx/5xx, the loop neither decrements `retries` nor returns an error, so it can spin forever. The same block also still unwraps `text()` and `deserialize_gnark_proof_json()`, which reintroduces panics on malformed responses. Please route both status failures and decode failures through the same `IndexerError` path.

💡 Proposed fix:
```diff
 let mut retries = 3;
 while retries > 0 {
     let response_result = client
         .post(format!("{}{}", SERVER_ADDRESS, PROVE_PATH))
         .header("Content-Type", "text/plain; charset=utf-8")
         .body(json_payload.clone())
         .send()
         .await;
-    if let Ok(response_result) = response_result {
-        if response_result.status().is_success() {
-            let body = response_result.text().await.unwrap();
-            let proof_json = deserialize_gnark_proof_json(&body).unwrap();
+    match response_result {
+        Ok(response_result) if response_result.status().is_success() => {
+            let body = response_result
+                .text()
+                .await
+                .map_err(|e| IndexerError::CustomError(e.to_string()))?;
+            let proof_json = deserialize_gnark_proof_json(&body)
+                .map_err(|e| IndexerError::CustomError(e.to_string()))?;
             let (proof_a, proof_b, proof_c) = proof_from_json_struct(proof_json)
                 .map_err(|e| IndexerError::CustomError(e.to_string()))?;
             let (proof_a, proof_b, proof_c) = compress_proof(&proof_a, &proof_b, &proof_c)
                 .map_err(|e| IndexerError::CustomError(e.to_string()))?;
             return Ok(ValidityProofWithContext {
                 accounts: account_proof_inputs,
                 addresses: address_proof_inputs,
                 proof: CompressedProof {
                     a: proof_a,
                     b: proof_b,
                     c: proof_c,
                 }
                 .into(),
             });
         }
-    } else {
-        println!("Error: {:#?}", response_result);
-        tokio::time::sleep(Duration::from_secs(5)).await;
-        retries -= 1;
+        Ok(response_result) => {
+            let status = response_result.status();
+            let body = response_result.text().await.unwrap_or_default();
+            retries -= 1;
+            if retries == 0 {
+                return Err(IndexerError::CustomError(format!(
+                    "Prover returned {}: {}",
+                    status, body
+                )));
+            }
+        }
+        Err(err) => {
+            retries -= 1;
+            if retries == 0 {
+                return Err(IndexerError::CustomError(err.to_string()));
+            }
+        }
     }
+    tokio::time::sleep(Duration::from_secs(5)).await;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 2498 - 2534, The loop currently only decrements retries on transport errors and unwraps response body and deserialization, causing hangs or panics for non-2xx or malformed responses; change the logic in the while retries loop that starts with `let response_result = client.post(...).send().await` to treat any non-success status as a failure path that decrements `retries`, sleeps, and eventually returns an `IndexerError::CustomError` if exhausted. Specifically: when `response_result` is Ok but `response_result.status().is_success()` is false, read the body safely (await and map errors) and convert the status+body into an `IndexerError::CustomError`, decrement `retries` and sleep before retrying; replace all `.unwrap()` calls on `response_result.text().await` and `deserialize_gnark_proof_json(&body)` with error-mapped returns into `IndexerError::CustomError`; keep the successful branch that calls `proof_from_json_struct` and `compress_proof` but also map their errors (they already use map_err) and ensure any failure there returns `Err(IndexerError::CustomError(...))` rather than panicking. This ensures `retries` is decremented for both transport and non-2xx responses and avoids unwrap-induced panics while preserving use of `deserialize_gnark_proof_json`, `proof_from_json_struct`, `compress_proof`, `ValidityProofWithContext`, and `CompressedProof`.

prover/client/src/proof_types/batch_address_append/proof_inputs.rs (1)
556-559: 🧹 Nitpick | 🔵 Trivial — Redundant slice: `[0..]` is unnecessary.

`new_element_values[0..]` is equivalent to just `new_element_values`. This appears to be a leftover from a refactor.

✨ Suggested simplification:
```diff
-    new_element_values: new_element_values[0..]
-        .iter()
+    new_element_values: new_element_values
+        .iter()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around lines 556 - 559, The slice new_element_values[0..] is redundant; update the mapping to iterate directly over new_element_values (i.e., replace new_element_values[0..].iter() with new_element_values.iter()) so the BigUint conversion via BigUint::from_bytes_be(v) and the collect() behavior remain unchanged; touch the expression in proof_inputs.rs where new_element_values is used in the map to remove the unnecessary slice.
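The equivalence behind this nitpick is easy to demonstrate in isolation: a full-range slice `[0..]` of a slice is the slice itself, so iterating it changes nothing.

```rust
// `values[0..]` is the full-range slice of `values`; iterating it is
// identical to iterating `values` directly.
fn sum_via_full_slice(values: &[u32]) -> u32 {
    values[0..].iter().sum()
}

fn sum_direct(values: &[u32]) -> u32 {
    values.iter().sum()
}
```

Both functions compile to the same iteration; removing `[0..]` is purely a readability fix.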
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 246-257: Add an IntegerConversion variant to the ProverClientError
enum so the try_into() error mappings in the batch address append code compile;
specifically add a variant like IntegerConversion(String) with the
#[error("Integer conversion error: {0}")] attribute to the ProverClientError
enum in errors.rs so calls that return ProverClientError::IntegerConversion from
the low_element_index / low_element_next_index conversion sites
(low_element_indices, low_element_next_indices) are valid.
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 70-73: The const-generic height is unchecked and can produce
truncated or missing proofs; update the proof-producing functions (e.g.,
reconstruct_proof and the other proof-returning methods in this file) to
validate the supplied const HEIGHT against the fixed address-tree height (40)
before constructing/returning a proof and return an appropriate IndexerError if
HEIGHT != 40 (alternatively persist the queue height on the struct and compare
it), ensuring any code paths that previously assumed a different HEIGHT fail
early with a clear error.
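The `IntegerConversion` variant suggested for `errors.rs` above can be sketched as follows. This is a dependency-free illustration: the manual `Display` impl stands in for thiserror's `#[error("Integer conversion error: {0}")]` attribute, and `to_u32` is a hypothetical conversion site.

```rust
use std::fmt;

// Dependency-free sketch; the real ProverClientError derives
// thiserror::Error and uses `#[error("Integer conversion error: {0}")]`
// instead of this manual Display impl.
#[derive(Debug)]
pub enum ProverClientError {
    IntegerConversion(String),
}

impl fmt::Display for ProverClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ProverClientError::IntegerConversion(msg) => {
                write!(f, "Integer conversion error: {}", msg)
            }
        }
    }
}

// A conversion site can then map try_into() failures into the variant:
fn to_u32(index: usize) -> Result<u32, ProverClientError> {
    index.try_into().map_err(|_| {
        ProverClientError::IntegerConversion(format!("index {} exceeds u32", index))
    })
}
```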
---
Outside diff comments:
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 556-559: The slice new_element_values[0..] is redundant; update
the mapping to iterate directly over new_element_values (i.e., replace
new_element_values[0..].iter() with new_element_values.iter()) so the BigUint
conversion via BigUint::from_bytes_be(v) and the collect() behavior remain
unchanged; touch the expression in proof_inputs.rs where new_element_values is
used in the map to remove the unnecessary slice.
In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2498-2534: The loop currently only decrements retries on transport
errors and unwraps response body and deserialization, causing hangs or panics
for non-2xx or malformed responses; change the logic in the while retries loop
that starts with `let response_result = client.post(...).send().await` to treat
any non-success status as a failure path that decrements `retries`, sleeps, and
eventually returns an `IndexerError::CustomError` if exhausted. Specifically:
when `response_result` is Ok but `response_result.status().is_success()` is
false, read the body safely (await and map errors) and convert the status+body
into an `IndexerError::CustomError`, decrement `retries` and sleep before
retrying; replace all `.unwrap()` calls on `response_result.text().await` and
`deserialize_gnark_proof_json(&body)` with error-mapped returns into
`IndexerError::CustomError`; keep the successful branch that calls
`proof_from_json_struct` and `compress_proof` but also map their errors (they
already use map_err) and ensure any failure there returns
`Err(IndexerError::CustomError(...))` rather than panicking. This ensures
`retries` is decremented for both transport and non-2xx responses and avoids
unwrap-induced panics while preserving use of `deserialize_gnark_proof_json`,
`proof_from_json_struct`, `compress_proof`, `ValidityProofWithContext`, and
`CompressedProof`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9c2af6fd-b65c-41f4-8efc-056487d57845
⛔ Files ignored due to path filters (5)
- forester-utils/src/address_staging_tree.rs is excluded by none and included by none
- program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
- program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
- sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
- sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (8)
- forester/src/processor/v2/helpers.rs
- forester/src/processor/v2/strategy/address.rs
- prover/client/src/proof_types/batch_address_append/proof_inputs.rs
- prover/client/tests/batch_address_append.rs
- sdk-libs/client/src/indexer/photon_indexer.rs
- sdk-libs/client/src/indexer/types/queue.rs
- sdk-libs/client/src/interface/load_accounts.rs
- sdk-libs/program-test/src/indexer/test_indexer.rs
Force-pushed from 238cb7f to 226df79
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
prover/client/src/helpers.rs (1)
49-54: ⚠️ Potential issue | 🟠 Major — Widen `path_index` before address trees cross `u32::MAX`.

`compute_root_from_merkle_proof` is now the shared fallible helper for both state and address trees, but it still encodes the path index as `u32`. The address-append callers already hold `usize`/`u64` indices for 40-level trees and currently downcast into this API; once `next_index` exceeds `4_294_967_295`, those casts will silently truncate the upper bits and compute the wrong root and changelog. Use `usize` or `u64` here and narrow only at a boundary that truly requires it.

Based on learnings: Address trees must initialize with `ADDRESS_TREE_INIT_ROOT_40` (hardcoded for height 40) and start with next_index = 1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/helpers.rs` around lines 49 - 54, The function compute_root_from_merkle_proof currently accepts path_index: u32 which causes truncation for address trees deeper than u32::MAX; change the parameter to a wider integer (prefer usize or u64) in the signature and all internal uses (e.g., ChangelogEntry::default_with_index call and any arithmetic) and only downcast to u32 at the true boundary if an API you call requires u32; also update any callers that downcast from usize/u64 to pass the wider type. Additionally ensure address-tree initialization uses ADDRESS_TREE_INIT_ROOT_40 and that the address-tree next_index starts at 1 (update constructors/initializers that set next_index) so address trees for height 40 behave correctly.

prover/client/src/proof_types/batch_append/proof_inputs.rs (1)
180-194: ⚠️ Potential issue | 🟠 Major — Remove the remaining proof-length panic in this path.

The new `?` only covers `compute_root_from_merkle_proof`; `merkle_proof.try_into().unwrap()` still aborts the process if the indexer returns a proof with the wrong height. This function already returns `ProverClientError`, so surface the mismatch instead of panicking.

Suggested fix:
```diff
-    let merkle_proof_array = merkle_proof.try_into().unwrap();
+    let proof_len = merkle_proof.len();
+    let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+        ProverClientError::GenericError(format!(
+            "Invalid merkle proof length: got {}, expected {}",
+            proof_len, HEIGHT
+        ))
+    })?;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/proof_types/batch_append/proof_inputs.rs` around lines 180 - 194, The code still panics on merkle_proof.try_into().unwrap(); replace the unwrap with fallible conversion that returns a ProverClientError instead of aborting: call merkle_proof.try_into().map_err(|e| ProverClientError::InvalidProofLength { source: e.into(), context: format!("expected height for index {}", start_index + i as u32) })? (or perform a match and return an appropriate ProverClientError), then continue using merkle_proof_array with compute_root_from_merkle_proof(final_leaf, &merkle_proof_array, start_index + i as u32)? so the function surfaces proof-length mismatches via its existing ProverClientError return type rather than panicking on unwrap.

prover/client/src/proof_types/batch_update/proof_inputs.rs (1)
171-178: ⚠️ Potential issue | 🟠 Major — Finish the fallible-path cleanup in `get_batch_update_inputs`.

This call site now propagates `compute_root_from_merkle_proof`, but the proof conversion and `Poseidon::hashv` immediately above still `unwrap()`. A malformed proof or invalid hash input will still panic the process instead of returning `ProverClientError`.

Suggested fix:
```diff
-    let merkle_proof_array = merkle_proof.try_into().unwrap();
+    let proof_len = merkle_proof.len();
+    let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+        ProverClientError::GenericError(format!(
+            "Invalid merkle proof length: got {}, expected {}",
+            proof_len, HEIGHT
+        ))
+    })?;
 @@
-    let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]]).unwrap();
+    let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]])?;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/proof_types/batch_update/proof_inputs.rs` around lines 171 - 178, The code in get_batch_update_inputs currently uses merkle_proof.try_into().unwrap() and Poseidon::hashv(...).unwrap(), which can panic; change both to propagate errors as ProverClientError instead: replace merkle_proof.try_into().unwrap() with merkle_proof.try_into().map_err(|e| ProverClientError::from(...))? (or an appropriate ProverClientError variant) and replace Poseidon::hashv(...).unwrap() with Poseidon::hashv(...).map_err(|e| ProverClientError::from(...))? so that malformed proofs or hash failures return an Err from get_batch_update_inputs and remain consistent with the compute_root_from_merkle_proof error propagation.

prover/client/src/proof_types/batch_address_append/proof_inputs.rs (1)
369-489: ⚠️ Potential issue | 🟠 Major — Keep the staging mutations transactional on `Err`.

This function now returns recoverable errors after it has already pushed into `changelog`/`indexed_changelog` and mutated `sparse_merkle_tree` (for example the later `compute_root_from_merkle_proof(...)?` calls and the final ordering check). If any of those fail, the caller gets `Err` but the local staging state is left partially advanced, so the next retry or batch runs against corrupted state. Build into temporary state and commit only once the whole batch succeeds, or add explicit rollback.

Also applies to: 500-518
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs` around lines 369 - 489, The code mutates staging state (changelog, indexed_changelog, sparse_merkle_tree, and pushes via patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append, and indexed_changelog.push) before fallible operations complete, so on Err the local state is partially advanced; fix by performing all fallible computations on temporary/clone state (clone sparse_merkle_tree, collect changelog entries and merkle proof arrays into local Vecs like new_element_circuit_merkle_proofs_tmp / indexed_changelog_tmp) and only call patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append (or replace the real tree), and indexed_changelog.push once the entire batch loop finishes successfully; alternatively implement an explicit rollback path that reverts mutations (undo append and pop changelog) on any early Err — apply same change for the other block around lines 500-518 where similar pushes occur.

sdk-libs/program-test/src/indexer/test_indexer.rs (1)
2500-2514: ⚠️ Potential issue | 🔴 Critical — Non-2xx prover responses hot-loop forever here.

When `send()` succeeds but the prover returns 4xx/5xx, `retries` is never decremented and the loop immediately issues another request. That turns a transient server error into an unbounded tight retry, and the adjacent `unwrap()` calls on `text()`/`deserialize_gnark_proof_json()` still panic on malformed responses instead of surfacing `IndexerError`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk-libs/program-test/src/indexer/test_indexer.rs` around lines 2500 - 2514, The loop currently treats any Ok(response) the same and only decrements retries on send() errors, causing tight infinite retries for 4xx/5xx and panics from unwrap() on malformed bodies; update the block that handles Ok(response_result) so that non-success HTTP statuses decrement retries (and optionally sleep) instead of immediately retrying, and convert the body/text and JSON parsing unwraps into fallible operations that map errors into IndexerError (e.g. replace response_result.text().await.unwrap() and deserialize_gnark_proof_json(...).unwrap() with ?-style error mapping to IndexerError), and ensure that any failure in proof_from_json_struct or compress_proof is returned as Err(IndexerError::CustomError(...)) rather than panicking; reference the response handling around deserialize_gnark_proof_json, proof_from_json_struct, compress_proof and IndexerError to locate the changes.

forester/src/processor/v2/strategy/address.rs (1)
269-308: ⚠️ Potential issue | 🟠 Major — Require a full ZKP batch before calling `process_batch`.

`StreamingAddressQueue::wait_for_batch()` can return early after its timeout, so `get_batch_snapshot()` may hand back fewer than `zkp_batch_size_usize` addresses. Feeding `addresses.len()` downstream as `zkp_batch_size_actual` turns that short snapshot into a real address-append job instead of treating it as "not ready yet", which means a slow indexer can produce inputs for an unsupported circuit size.

Suggested guard:
```diff
-    let zkp_batch_size_actual = addresses.len();
+    if addresses.len() != zkp_batch_size_usize {
+        return Ok(None);
+    }
     let result = staging_tree
         .process_batch(
             &addresses,
             &low_element_values,
             &low_element_next_values,
             &low_element_indices,
             &low_element_next_indices,
             &low_element_proofs,
             leaves_hashchain,
-            zkp_batch_size_actual,
+            zkp_batch_size_usize,
             epoch,
             tree,
         )
```

Based on learnings: ZKP batch sizes for address trees must be 10 or 250 in src/initialize_address_tree.rs. Only these circuit sizes are supported for address trees.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@forester/src/processor/v2/strategy/address.rs` around lines 269 - 308, The snapshot returned by StreamingAddressQueue::get_batch_snapshot may be shorter than the ZKP batch size; before calling staging_tree.process_batch, verify that addresses.len() equals zkp_batch_size_usize (the expected batch size) and, if it is smaller, treat the batch as "not ready" (do not call process_batch) — e.g., return/propagate a no-op/NotReady result or retry later. Update the logic around AddressBatchSnapshot retrieval (get_batch_snapshot::<{ DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>) to check zkp_batch_size_actual against zkp_batch_size_usize and only pass a full batch into staging_tree.process_batch; ensure the code path preserves the existing error mapping via map_address_staging_error when a full batch is processed.
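The full-batch guard described above reduces to a length check before any processing happens. In this sketch `process_if_full` is a hypothetical stand-in for the call into `staging_tree.process_batch`; a short snapshot yields `None` ("not ready") instead of a job with an unsupported circuit size.

```rust
// Minimal sketch of the suggested guard: a short snapshot is treated as
// "not ready yet" rather than being forwarded downstream.
// `process_if_full` is hypothetical; the real code calls
// staging_tree.process_batch with many more arguments.
fn process_if_full(addresses: &[[u8; 32]], zkp_batch_size: usize) -> Option<usize> {
    if addresses.len() != zkp_batch_size {
        // Not a full ZKP batch; the caller should retry later.
        return None;
    }
    // Full batch: hand exactly zkp_batch_size items downstream.
    Some(zkp_batch_size)
}
```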
♻️ Duplicate comments (1)
sdk-libs/client/src/indexer/types/queue.rs (1)
70-73: ⚠️ Potential issue | 🟠 Major — Reject non-40 proof heights before reconstruction.

These methods now accept any const `HEIGHT`, but address-queue proofs are only valid at height 40. `HEIGHT < 40` silently truncates the witness, `HEIGHT > 40` degrades into a late `MissingResult`, and `encode_node_index()` only reserves 8 bits for `level`, so oversized values can alias node ids. Add one shared height guard before any proof reconstruction path and fail early with a clear `IndexerError`.

Based on learnings: "Tree height must be fixed at 40 for address trees in src/initialize_address_tree.rs. Validate that height equals 40."

Also applies to: 125-143, 155-205, 209-210
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk-libs/client/src/indexer/types/queue.rs` around lines 70 - 73, Reject any const HEIGHT != 40 up-front in all proof-reconstruction entry points (e.g., reconstruct_proof and the other methods referenced) by adding a single shared guard that checks HEIGHT == 40 and returns a clear IndexerError when it is not; update any call sites that route into encode_node_index or witness/truncation logic so they never run for non-40 heights, and ensure the error message/variant clearly states expected=40 and the provided height to make debugging explicit.
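The shared height guard suggested above can be sketched like this; the error type is simplified to a `String` in place of `IndexerError`, and the function name is illustrative.

```rust
// Sketch of the early guard: reject any const HEIGHT other than the
// fixed address-tree height (40) before touching proof data.
// Error type and message are illustrative, not the real IndexerError.
const ADDRESS_TREE_HEIGHT: usize = 40;

fn check_height<const HEIGHT: usize>() -> Result<(), String> {
    if HEIGHT != ADDRESS_TREE_HEIGHT {
        return Err(format!(
            "invalid proof height: expected {}, got {}",
            ADDRESS_TREE_HEIGHT, HEIGHT
        ));
    }
    Ok(())
}
```

Each proof-reconstruction entry point would call this once up front, so every non-40 instantiation fails with the same explicit message instead of a late `MissingResult`.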
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 146-152: build_node_lookup currently maps node ids to the last
index because collect() overwrites duplicates, which breaks first-match
semantics used by reconstruct_proof (which uses position()); update
build_node_lookup to either detect and reject duplicate node ids or preserve the
first index by using HashMap::entry().or_insert(...) so the first occurrence
remains in the map (refer to function build_node_lookup and callers like
reconstruct_proof to ensure consistent behavior).
In `@sdk-libs/client/src/interface/load_accounts.rs`:
- Around line 213-216: The debug-only check for max_per_group (> 0) is
inconsistent with the runtime branch that returns an empty Vec when
max_per_group == 0; make the behavior consistent by failing in all builds:
replace the debug_assert!(max_per_group > 0, ...) with a plain
assert!(max_per_group > 0, ...) (or otherwise return an Err if this function is
fallible) and remove the special-case path that returns Vec::new() when
max_per_group == 0 so that invalid zero input always triggers the same failure;
ensure references to max_per_group and specs remain and remove the silent
Vec::new() return for the zero case.
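The first-occurrence-preserving `build_node_lookup` described in the `queue.rs` comment above can be sketched with `entry().or_insert`, which keeps the first index for a duplicated node id and so matches `position()` semantics. Node-id and index types are simplified for illustration.

```rust
use std::collections::HashMap;

// Sketch: entry().or_insert keeps the FIRST index seen for each node id,
// unlike collect(), which lets later duplicates overwrite earlier ones.
fn build_node_lookup(node_ids: &[u64]) -> HashMap<u64, usize> {
    let mut lookup = HashMap::new();
    for (idx, id) in node_ids.iter().enumerate() {
        // or_insert is a no-op when the key already exists.
        lookup.entry(*id).or_insert(idx);
    }
    lookup
}
```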
---
Outside diff comments:
In `@forester/src/processor/v2/strategy/address.rs`:
- Around line 269-308: The snapshot returned by
StreamingAddressQueue::get_batch_snapshot may be shorter than the ZKP batch
size; before calling staging_tree.process_batch, verify that addresses.len()
equals zkp_batch_size_usize (the expected batch size) and, if it is smaller,
treat the batch as "not ready" (do not call process_batch) — e.g.,
return/propagate a no-op/NotReady result or retry later. Update the logic around
AddressBatchSnapshot retrieval (get_batch_snapshot::<{
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>) to check zkp_batch_size_actual
against zkp_batch_size_usize and only pass a full batch into
staging_tree.process_batch; ensure the code path preserves the existing error
mapping via map_address_staging_error when a full batch is processed.
In `@prover/client/src/helpers.rs`:
- Around line 49-54: The function compute_root_from_merkle_proof currently
accepts path_index: u32 which causes truncation for address trees deeper than
u32::MAX; change the parameter to a wider integer (prefer usize or u64) in the
signature and all internal uses (e.g., ChangelogEntry::default_with_index call
and any arithmetic) and only downcast to u32 at the true boundary if an API you
call requires u32; also update any callers that downcast from usize/u64 to pass
the wider type. Additionally ensure address-tree initialization uses
ADDRESS_TREE_INIT_ROOT_40 and that the address-tree next_index starts at 1
(update constructors/initializers that set next_index) so address trees for
height 40 behave correctly.
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 369-489: The code mutates staging state (changelog,
indexed_changelog, sparse_merkle_tree, and pushes via
patcher.push_changelog_entry::<HEIGHT>, sparse_merkle_tree.append, and
indexed_changelog.push) before fallible operations complete, so on Err the local
state is partially advanced; fix by performing all fallible computations on
temporary/clone state (clone sparse_merkle_tree, collect changelog entries and
merkle proof arrays into local Vecs like new_element_circuit_merkle_proofs_tmp /
indexed_changelog_tmp) and only call patcher.push_changelog_entry::<HEIGHT>,
sparse_merkle_tree.append (or replace the real tree), and indexed_changelog.push
once the entire batch loop finishes successfully; alternatively implement an
explicit rollback path that reverts mutations (undo append and pop changelog) on
any early Err — apply same change for the other block around lines 500-518 where
similar pushes occur.
In `@prover/client/src/proof_types/batch_append/proof_inputs.rs`:
- Around line 180-194: The code still panics on
merkle_proof.try_into().unwrap(); replace the unwrap with fallible conversion
that returns a ProverClientError instead of aborting: call
merkle_proof.try_into().map_err(|e| ProverClientError::InvalidProofLength {
source: e.into(), context: format!("expected height for index {}", start_index +
i as u32) })? (or perform a match and return an appropriate ProverClientError),
then continue using merkle_proof_array with
compute_root_from_merkle_proof(final_leaf, &merkle_proof_array, start_index + i
as u32)? so the function surfaces proof-length mismatches via its existing
ProverClientError return type rather than panicking on unwrap.
In `@prover/client/src/proof_types/batch_update/proof_inputs.rs`:
- Around line 171-178: The code in get_batch_update_inputs currently uses
merkle_proof.try_into().unwrap() and Poseidon::hashv(...).unwrap(), which can
panic; change both to propagate errors as ProverClientError instead: replace
merkle_proof.try_into().unwrap() with merkle_proof.try_into().map_err(|e|
ProverClientError::from(...))? (or an appropriate ProverClientError variant) and
replace Poseidon::hashv(...).unwrap() with Poseidon::hashv(...).map_err(|e|
ProverClientError::from(...))? so that malformed proofs or hash failures return
an Err from get_batch_update_inputs and remain consistent with the
compute_root_from_merkle_proof error propagation.
In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2500-2514: The loop currently treats any Ok(response) the same and
only decrements retries on send() errors, causing tight infinite retries for
4xx/5xx and panics from unwrap() on malformed bodies; update the block that
handles Ok(response_result) so that non-success HTTP statuses decrement retries
(and optionally sleep) instead of immediately retrying, and convert the
body/text and JSON parsing unwraps into fallible operations that map errors into
IndexerError (e.g. replace response_result.text().await.unwrap() and
deserialize_gnark_proof_json(...).unwrap() with ?-style error mapping to
IndexerError), and ensure that any failure in proof_from_json_struct or
compress_proof is returned as Err(IndexerError::CustomError(...)) rather than
panicking; reference the response handling around deserialize_gnark_proof_json,
proof_from_json_struct, compress_proof and IndexerError to locate the changes.
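The fallible merkle-proof conversion recommended in the comments above boils down to a length-checked `try_into`. In this sketch the error is simplified to a `String` in place of a `ProverClientError` variant.

```rust
// Sketch of the fallible conversion: a Vec of the wrong length becomes
// an error instead of a panic. Rust's TryFrom<Vec<T>> for [T; N] fails
// (returning the Vec) whenever the lengths differ.
fn to_proof_array<const HEIGHT: usize>(
    merkle_proof: Vec<[u8; 32]>,
) -> Result<[[u8; 32]; HEIGHT], String> {
    let len = merkle_proof.len();
    merkle_proof.try_into().map_err(|_| {
        format!("Invalid merkle proof length: got {}, expected {}", len, HEIGHT)
    })
}
```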
---
Duplicate comments:
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 70-73: Reject any const HEIGHT != 40 up-front in all
proof-reconstruction entry points (e.g., reconstruct_proof and the other methods
referenced) by adding a single shared guard that checks HEIGHT == 40 and returns
a clear IndexerError when it is not; update any call sites that route into
encode_node_index or witness/truncation logic so they never run for non-40
heights, and ensure the error message/variant clearly states expected=40 and the
provided height to make debugging explicit.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f30f9f95-bb7c-4dcf-b433-899ee6f79ada
⛔ Files ignored due to path filters (5)
- forester-utils/src/address_staging_tree.rs is excluded by none and included by none
- program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
- program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
- sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
- sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (12)
- forester/src/processor/v2/helpers.rs
- forester/src/processor/v2/strategy/address.rs
- prover/client/src/errors.rs
- prover/client/src/helpers.rs
- prover/client/src/proof_types/batch_address_append/proof_inputs.rs
- prover/client/src/proof_types/batch_append/proof_inputs.rs
- prover/client/src/proof_types/batch_update/proof_inputs.rs
- prover/client/tests/batch_address_append.rs
- sdk-libs/client/src/indexer/photon_indexer.rs
- sdk-libs/client/src/indexer/types/queue.rs
- sdk-libs/client/src/interface/load_accounts.rs
- sdk-libs/program-test/src/indexer/test_indexer.rs
Force-pushed from bfa6672 to e70763e
Force-pushed from e70763e to 10ea034
Force-pushed from 19530a6 to a39fe5d
Force-pushed from a39fe5d to ed44b77
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
prover/client/src/helpers.rs (1)
49-54: ⚠️ Potential issue | 🟠 Major — Make `path_index` wide enough for address-tree indices.

The helper is now `HEIGHT`-generic, but `path_index` is still `u32`. The new address-append callers already hold indices as `usize` and have to downcast them with `as u32`; once an address tree passes `u32::MAX`, root reconstruction will wrap and produce the wrong path/changelog long before the tree is full.

Proposed fix:
 pub fn compute_root_from_merkle_proof<const HEIGHT: usize>(
     leaf: [u8; 32],
     path_elements: &[[u8; 32]; HEIGHT],
-    path_index: u32,
+    path_index: usize,
 ) -> Result<([u8; 32], ChangelogEntry<HEIGHT>), ProverClientError> {
-    let mut changelog_entry = ChangelogEntry::default_with_index(path_index as usize);
+    let mut changelog_entry = ChangelogEntry::default_with_index(path_index);
     let mut current_hash = leaf;
     let mut current_index = path_index;
Then update the address-append call sites to pass `low_element.index`, `new_low_element.index`, and `current_index` directly, without lossy casts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/helpers.rs` around lines 49 - 54, The function compute_root_from_merkle_proof currently takes path_index: u32 which can overflow for large address trees; change its signature to accept path_index: usize and update the internal use of ChangelogEntry::default_with_index(path_index as usize) to pass the usize directly (ChangelogEntry::default_with_index(path_index)), then update all callers (the address-append sites that currently cast indices with as u32) to pass their usize indices directly (e.g., low_element.index, new_low_element.index, current_index) so there is no lossy cast or potential wrap-around.
prover/client/tests/batch_address_append.rs (1)
87-109: 🧹 Nitpick | 🔵 Trivial
Add a non-empty changelog regression here.
This test still starts with empty `changelog` and `indexed_changelog`, so it only exercises the first-batch branch. The new proof-patching logic is only used once those vectors already contain prior entries, which is the riskier path introduced in this PR.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/tests/batch_address_append.rs` around lines 87 - 109, The test currently initializes changelog and indexed_changelog as empty vectors so get_batch_address_append_circuit_inputs only covers the first-batch path; pre-populate both changelog: Vec<ChangelogEntry<{ DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> and indexed_changelog: Vec<IndexedChangelogEntry<usize, { DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> with one or more realistic entries (matching the ChangelogEntry and IndexedChangelogEntry shapes used elsewhere in tests) before calling get_batch_address_append_circuit_inputs so the function exercises the proof-patching logic for non-empty prior entries; ensure the synthetic entries use valid roots/indices/values consistent with the sparse_merkle_tree and DEFAULT_BATCH_ADDRESS_TREE_HEIGHT expectations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@forester/src/processor/v2/helpers.rs`:
- Around line 489-534: In get_batch_snapshot: reject truncated snapshots early
by changing the available check to return Ok(None) unless available >= end
(i.e., if self.wait_for_batch(end) < end -> Ok(None)) so we never construct
proofs for partial batches; before slicing validate that
data.low_element_values, low_element_next_values, low_element_indices,
low_element_next_indices and data.addresses have length >= actual_end and that
data.reconstruct_proofs::<HEIGHT>(start..actual_end) can be produced (otherwise
return an error like "incomplete batch data"); keep the existing
leaves_hash_chains fallback logic but do not proceed to build
AddressBatchSnapshot when any of these sibling/vector length checks fail.
In `@prover/client/src/proof_types/batch_address_append/proof_inputs.rs`:
- Around line 202-209: Before slicing new_element_values and allocating
per-element vectors, validate that new_element_values.len() >= zkp_batch_size
and that each corresponding input slice (the low_element_* and new_element_*
sources used later) has at least zkp_batch_size elements; if any check fails
return a ProverClientError instead of panicking. Move the check to just before
the line creating the slice (currently using let new_element_values =
&new_element_values[..zkp_batch_size];), only perform the slice after
validation, and use a single safe variable (e.g., batch_len = zkp_batch_size)
when creating capacities and indexing in subsequent loops so all
Vec::with_capacity(...) and later indexing use the validated length. Ensure the
error returned is descriptive (e.g., "truncated batch from indexer") and that
all low_element_* and new_element_* sources are validated the same way to avoid
any later panics.
In `@prover/client/src/prover.rs`:
- Around line 54-57: health_check currently builds a reqwest client with
no_proxy() while ProofClient::local(), ProofClient::with_config(), and
ProofClient::with_full_config() use Client::new(), causing divergent proxy
behavior and a silent ignore of builder errors; create a shared helper (e.g.,
build_http_client() -> Result<reqwest::Client, anyhow::Error>) that constructs
the client with the desired proxy policy (no_proxy() or configurable as
decided), return and propagate the Result so callers can log errors instead of
silently returning, then refactor health_check and all ProofClient constructors
to call this helper (replace direct Client::new() and the current builder call),
and ensure any builder Err is logged via the existing logger before returning an
error.
In `@sdk-libs/client/src/indexer/photon_indexer.rs`:
- Around line 1145-1154: The code currently only checks x.proof.len() <
STATE_MERKLE_TREE_CANOPY_DEPTH which allows truncated proofs; update the
validation before building the MerkleProof (around x.proof, proof_len logic) to
require the post-canopy proof length equals the expected sibling count (i.e.
expected_siblings = STATE_MERKLE_TREE_HEIGHT - STATE_MERKLE_TREE_CANOPY_DEPTH or
the known constant 16 for height 26/canopy 10). If x.proof.len() does not equal
STATE_MERKLE_TREE_CANOPY_DEPTH + expected_siblings (or proof_len !=
expected_siblings) return Err(IndexerError::InvalidParameters(...)) instead of
silently slicing and constructing the MerkleProof.
In `@sdk-libs/client/src/indexer/types/queue.rs`:
- Around line 128-137: The method reconstruct_proofs currently reserves capacity
using address_range.len() before checking that the requested range is within the
available proofs; validate the range bounds up front (before Vec::with_capacity)
by comparing address_range.start and address_range.end against the available
proof count (e.g., self.proof_count() or the relevant collection length on
self), and return an IndexerError for out-of-range requests; only then call
Vec::with_capacity and proceed to call build_node_lookup and
reconstruct_proof_with_lookup::<HEIGHT>.
In `@sdk-libs/client/src/interface/load_accounts.rs`:
- Around line 120-124: The failure index reported by MissingPdaCompressed is
non-unique because collect_pda_hashes(group) restarts enumerate() per batch;
change the code so the global cold-PDA offset is carried into each group (or
remove the index from MissingPdaCompressed). Concretely: modify
collect_pda_hashes to accept a start_index/offset (or create a new
collect_pda_hashes_with_offset) and when mapping pda_groups pass the accumulated
offset (e.g., outer_idx * MAX_PDAS_PER_IX or running total) so that the inner
enumerate adds that offset when constructing MissingPdaCompressed.index;
alternatively, remove the index field from MissingPdaCompressed to match
MissingPdaCompressedData for consistency.
- Around line 244-259: fetch_individual_proofs spawns all RPCs at once which can
overwhelm the indexer; replace the try_join_all approach with a bounded
concurrency stream so requests are limited but results keep original order. Use
futures::stream::iter over the hashes, map each hash to the async
indexer.get_validity_proof call, then apply StreamExt::buffered(CONCURRENCY) (or
a named constant) and collect the responses into a Vec to preserve positional
order; apply the same pattern to fetch_proof_batches (use the same CONCURRENCY
constant and the same stream->buffered->collect flow) so both functions limit
concurrent RPCs while keeping result ordering for subsequent zip operations.
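The key property of the suggested `stream::iter(..).buffered(CONCURRENCY)` fix is bounded fan-out with results kept in input order. A std-only, thread-based sketch of that property is below; the real fix would stay async with the futures crate, so `fetch_bounded` is purely illustrative:

```rust
use std::thread;

/// Order-preserving bounded fan-out: run at most `concurrency` workers per
/// wave and join them in spawn order, so results stay aligned with inputs
/// (the property later zip operations depend on).
fn fetch_bounded<T, R, F>(items: Vec<T>, concurrency: usize, f: F) -> Vec<R>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Copy + Send + 'static,
{
    let concurrency = concurrency.max(1);
    let mut results = Vec::with_capacity(items.len());
    let mut iter = items.into_iter();
    loop {
        // Take the next wave of at most `concurrency` items.
        let wave: Vec<T> = iter.by_ref().take(concurrency).collect();
        if wave.is_empty() {
            break;
        }
        let handles: Vec<_> = wave
            .into_iter()
            .map(|item| thread::spawn(move || f(item)))
            .collect();
        // Joining in spawn order preserves input order in `results`.
        for handle in handles {
            results.push(handle.join().expect("worker panicked"));
        }
    }
    results
}
```

`StreamExt::buffered` gives the same guarantee more efficiently: it keeps up to N futures in flight continuously and yields results in the original stream order.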
In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 729-731: The response currently sets start_index to
address_tree_bundle.right_most_index(), which breaks pagination semantics
because the returned addresses slice is still paginated by
options.address_queue_start_index; revert start_index to reflect the pagination
offset (use options.address_queue_start_index) and add a new field (e.g.,
tree_next_insertion_index or insertion_index) to the response struct to carry
address_tree_bundle.right_most_index() as u64; update the response construction
where start_index and addresses are populated and adjust the response
type/struct (and any builders) to include the new field so callers can read the
sparse-tree next insertion index without overloading start_index.
---
Outside diff comments:
In `@prover/client/src/helpers.rs`:
- Around line 49-54: The function compute_root_from_merkle_proof currently takes
path_index: u32 which can overflow for large address trees; change its signature
to accept path_index: usize and update the internal use of
ChangelogEntry::default_with_index(path_index as usize) to pass the usize
directly (ChangelogEntry::default_with_index(path_index)), then update all
callers (the address-append sites that currently cast indices with as u32) to
pass their usize indices directly (e.g., low_element.index,
new_low_element.index, current_index) so there is no lossy cast or potential
wrap-around.
In `@prover/client/tests/batch_address_append.rs`:
- Around line 87-109: The test currently initializes changelog and
indexed_changelog as empty vectors so get_batch_address_append_circuit_inputs
only covers the first-batch path; pre-populate both changelog:
Vec<ChangelogEntry<{ DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> and
indexed_changelog: Vec<IndexedChangelogEntry<usize, {
DEFAULT_BATCH_ADDRESS_TREE_HEIGHT as usize }>> with one or more realistic
entries (matching the ChangelogEntry and IndexedChangelogEntry shapes used
elsewhere in tests) before calling get_batch_address_append_circuit_inputs so
the function exercises the proof-patching logic for non-empty prior entries;
ensure the synthetic entries use valid roots/indices/values consistent with the
sparse_merkle_tree and DEFAULT_BATCH_ADDRESS_TREE_HEIGHT expectations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 276f6819-fcb6-48df-a7d4-de182c3af1f6
⛔ Files ignored due to path filters (7)
- Cargo.lock is excluded by `!**/*.lock` and included by none
- forester-utils/src/address_staging_tree.rs is excluded by none and included by none
- program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
- program-tests/utils/src/mock_batched_forester.rs is excluded by none and included by none
- program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
- sparse-merkle-tree/src/indexed_changelog.rs is excluded by none and included by none
- sparse-merkle-tree/tests/indexed_changelog.rs is excluded by none and included by none
📒 Files selected for processing (14)
- forester/src/processor/v2/helpers.rs
- forester/src/processor/v2/strategy/address.rs
- prover/client/src/constants.rs
- prover/client/src/errors.rs
- prover/client/src/helpers.rs
- prover/client/src/proof_types/batch_address_append/proof_inputs.rs
- prover/client/src/proof_types/batch_append/proof_inputs.rs
- prover/client/src/proof_types/batch_update/proof_inputs.rs
- prover/client/src/prover.rs
- prover/client/tests/batch_address_append.rs
- sdk-libs/client/src/indexer/photon_indexer.rs
- sdk-libs/client/src/indexer/types/queue.rs
- sdk-libs/client/src/interface/load_accounts.rs
- sdk-libs/program-test/src/indexer/test_indexer.rs
prover/client/src/proof_types/batch_address_append/proof_inputs.rs
Outdated
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
prover/client/src/proof_types/batch_update/proof_inputs.rs (1)
139-178: ⚠️ Potential issue | 🟠 Major
Fail fast on mismatched batch-update inputs.
This loop only zips `leaves`, `merkle_proofs`, and `path_indices`, but still indexes `tx_hashes[i]`. A short `tx_hashes` vector, a short proof, or a hash failure still panics here even after making `compute_root_from_merkle_proof()` fallible.
Possible fix
+    let expected_len = batch_size as usize;
+    if tx_hashes.len() != expected_len
+        || leaves.len() != expected_len
+        || old_leaves.len() != expected_len
+        || merkle_proofs.len() != expected_len
+        || path_indices.len() != expected_len
+    {
+        return Err(ProverClientError::GenericError(format!(
+            "Input vector length mismatch: tx_hashes={}, leaves={}, old_leaves={}, merkle_proofs={}, path_indices={}, expected={}",
+            tx_hashes.len(),
+            leaves.len(),
+            old_leaves.len(),
+            merkle_proofs.len(),
+            path_indices.len(),
+            expected_len,
+        )));
+    }
+
-    for (i, (leaf, (mut merkle_proof, index))) in leaves
+    for (i, ((leaf, tx_hash), (mut merkle_proof, index))) in leaves
         .iter()
+        .zip(tx_hashes.iter())
-        .zip(merkle_proofs.into_iter().zip(path_indices.iter()))
+        .zip(merkle_proofs.into_iter().zip(path_indices.iter()))
         .enumerate()
     {
@@
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
@@
-        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, &tx_hashes[i]]).unwrap();
+        let nullifier = Poseidon::hashv(&[leaf, &index_bytes, tx_hash])?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/proof_types/batch_update/proof_inputs.rs` around lines 139 - 178, The loop can panic because tx_hashes is indexed by i while the zip only covers leaves, merkle_proofs and path_indices; validate input lengths up front or include tx_hashes in the zip to avoid out-of-bounds access and remove unwraps: check that leaves.len() == merkle_proofs.len() == path_indices.len() == tx_hashes.len() and return a ProverClientError if not, replace merkle_proof.try_into().unwrap() with a fallible conversion that maps the error into ProverClientError, and ensure Poseidon::hashv and compute_root_from_merkle_proof errors are propagated rather than causing panics.
prover/client/src/proof_types/batch_append/proof_inputs.rs (1)
135-194: ⚠️ Potential issue | 🟠 Major
Validate the batch-append inputs before building the circuit.
This loop truncates to the shortest of `old_leaves`, `leaves`, and `merkle_proofs`, so a mismatch can silently produce fewer append proofs than `batch_size` claims. `merkle_proof.try_into().unwrap()` also still turns a bad indexer response into a panic instead of a `ProverClientError`.
Possible fix
+    let expected_len = batch_size as usize;
+    if old_leaves.len() != expected_len
+        || leaves.len() != expected_len
+        || merkle_proofs.len() != expected_len
+    {
+        return Err(ProverClientError::GenericError(format!(
+            "Input vector length mismatch: old_leaves={}, leaves={}, merkle_proofs={}, expected={}",
+            old_leaves.len(),
+            leaves.len(),
+            merkle_proofs.len(),
+            expected_len,
+        )));
+    }
+
     for (i, (old_leaf, (new_leaf, mut merkle_proof))) in old_leaves
         .iter()
         .zip(leaves.iter().zip(merkle_proofs.into_iter()))
         .enumerate()
     {
@@
-        let merkle_proof_array = merkle_proof.try_into().unwrap();
+        let proof_len = merkle_proof.len();
+        let merkle_proof_array: [[u8; 32]; HEIGHT] = merkle_proof.try_into().map_err(|_| {
+            ProverClientError::GenericError(format!(
+                "Invalid merkle proof length: got {}, expected {}",
+                proof_len, HEIGHT
+            ))
+        })?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/proof_types/batch_append/proof_inputs.rs` around lines 135 - 194, The loop silently truncates to the shortest iterator (old_leaves, leaves, merkle_proofs) and panics on invalid proof conversion; validate input lengths and convert safely before building the circuit: check that old_leaves.len(), leaves.len(), and merkle_proofs.len() equal batch_size (and/or start_index + batch_size bounds) and return a ProverClientError on mismatch; replace merkle_proof.try_into().unwrap() with a fallible conversion that maps the error into ProverClientError (maintain context like current_index and batch_position); keep remaining logic (is_old_leaf_zero, compute_root_from_merkle_proof) unchanged.
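The two validation moves asked for in both loops — fail fast on mismatched vector lengths, and convert dynamically sized proofs fallibly — can be sketched standalone. `HEIGHT`, the function names, and the `String` error type are illustrative, not the prover client's actual API:

```rust
const HEIGHT: usize = 4; // illustrative tree height, not the production value

/// Convert a dynamically sized proof into the fixed-size array a circuit
/// expects, returning an error instead of panicking on a truncated response.
/// TryFrom<Vec<T>> for [T; N] succeeds only when the lengths match exactly.
fn proof_to_array(proof: Vec<[u8; 32]>) -> Result<[[u8; 32]; HEIGHT], String> {
    let got = proof.len();
    proof
        .try_into()
        .map_err(|_| format!("invalid merkle proof length: got {got}, expected {HEIGHT}"))
}

/// Fail fast when parallel input vectors disagree on the batch size, instead
/// of letting zip() silently truncate to the shortest one.
fn check_batch_lens(lens: &[usize], expected: usize) -> Result<(), String> {
    if lens.iter().any(|&len| len != expected) {
        return Err(format!(
            "input vector length mismatch: {lens:?}, expected {expected}"
        ));
    }
    Ok(())
}
```

Checking all lengths once up front means the body of the loop can index freely without any per-iteration bounds anxiety.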
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@forester/src/processor/v2/helpers.rs`:
- Around line 489-494: get_batch_snapshot currently accepts independent
caller-controlled cursors (start..end and hashchain_idx); enforce batch
alignment by computing the batch index from start using self.zkp_batch_size
(e.g., batch_idx = start / self.zkp_batch_size), require end == start +
self.zkp_batch_size, and derive the hash-chain index from that batch index
instead of trusting the caller-provided hashchain_idx (or validate that the
passed hashchain_idx equals the derived value and return an error if not).
Update get_batch_snapshot to validate these invariants and use the derived
hash-chain index when loading the hash chain; apply the same change to the
analogous helper at the other location referenced in the comment.
In `@prover/client/src/prover.rs`:
- Around line 29-31: The health_check_once function incorrectly returns true as
soon as prover_listener_present() is true; instead, remove the early return and
treat the listener as only a hint—always perform the actual HTTP readiness check
against the prover endpoint (the same logic used in the later HTTP probe block)
before returning true. Update health_check_once to, when
prover_listener_present() is true, attempt the HTTP GET health probe (or
call/reuse the existing HTTP check routine), only returning true on a successful
HTTP response; ensure spawn_prover() still starts the process if the HTTP probe
fails despite a listening port.
- Around line 136-188: The IS_LOADING flag can remain true if the awaited
startup block panics or calls expect!, so ensure IS_LOADING.store(false,
Ordering::Release) runs on every exit path of the startup sequence: wrap the
async startup block currently assigned to spawn_result in a panic::catch_unwind
(or use a scope guard/defer) so you can clear IS_LOADING in a finally-like
section after awaiting the block, then re-propagate any panic or return the
block's result; reference the IS_LOADING static, the spawn_result async block,
and the code paths that call panic! / expect! (Command::spawn, health_check
loop, child.try_wait) to place the cleanup.
In `@sdk-libs/client/src/indexer/photon_indexer.rs`:
- Around line 1693-1695: The assignment is wrong: do not set
tree_next_insertion_index to aq.start_index; use the queue's stored tree
insertion position instead so the staging tree aligns with the real next leaf.
Update the struct initialization where start_index: aq.start_index,
tree_next_insertion_index: aq.start_index, root_seq: aq.root_seq to set
tree_next_insertion_index from the queue's actual next-insertion field (e.g.,
aq.tree_next_insertion_index or the correct field that records the tree's next
leaf) so AddressStagingTree::new(..., start_index) and downstream alignment
checks use the true tree insertion index rather than the slice start.
---
Outside diff comments:
In `@prover/client/src/proof_types/batch_append/proof_inputs.rs`:
- Around line 135-194: The loop silently truncates to the shortest iterator
(old_leaves, leaves, merkle_proofs) and panics on invalid proof conversion;
validate input lengths and convert safely before building the circuit: check
that old_leaves.len(), leaves.len(), and merkle_proofs.len() equal batch_size
(and/or start_index + batch_size bounds) and return a ProverClientError on
mismatch; replace merkle_proof.try_into().unwrap() with a fallible conversion
that maps the error into ProverClientError (maintain context like current_index
and batch_position); keep remaining logic (is_old_leaf_zero,
compute_root_from_merkle_proof) unchanged.
In `@prover/client/src/proof_types/batch_update/proof_inputs.rs`:
- Around line 139-178: The loop can panic because tx_hashes is indexed by i
while the zip only covers leaves, merkle_proofs and path_indices; validate input
lengths up front or include tx_hashes in the zip to avoid out-of-bounds access
and remove unwraps: check that leaves.len() == merkle_proofs.len() ==
path_indices.len() == tx_hashes.len() and return a ProverClientError if not,
replace merkle_proof.try_into().unwrap() with a fallible conversion that maps
the error into ProverClientError, and ensure Poseidon::hashv and
compute_root_from_merkle_proof errors are propagated rather than causing panics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2250d3ec-f597-48a1-97ac-13a6819486f3
⛔ Files ignored due to path filters (3)
- program-tests/utils/src/e2e_test_env.rs is excluded by none and included by none
- program-tests/utils/src/mock_batched_forester.rs is excluded by none and included by none
- program-tests/utils/src/test_batch_forester.rs is excluded by none and included by none
📒 Files selected for processing (18)
- forester/src/processor/v2/helpers.rs
- forester/src/processor/v2/processor.rs
- forester/src/processor/v2/proof_worker.rs
- forester/src/processor/v2/strategy/address.rs
- prover/client/src/helpers.rs
- prover/client/src/proof_client.rs
- prover/client/src/proof_types/batch_address_append/proof_inputs.rs
- prover/client/src/proof_types/batch_append/proof_inputs.rs
- prover/client/src/proof_types/batch_update/proof_inputs.rs
- prover/client/src/prover.rs
- prover/client/tests/batch_address_append.rs
- sdk-libs/client/src/indexer/photon_indexer.rs
- sdk-libs/client/src/indexer/types/queue.rs
- sdk-libs/client/src/interface/load_accounts.rs
- sdk-libs/program-test/src/indexer/test_indexer.rs
- sdk-libs/sdk-types/src/interface/program/decompression/pda.rs
- sdk-libs/sdk-types/src/interface/program/decompression/processor.rs
- sdk-libs/sdk-types/src/interface/program/decompression/token.rs
pub fn get_batch_snapshot<const HEIGHT: usize>(
    &self,
    start: usize,
    end: usize,
    hashchain_idx: usize,
) -> crate::Result<Option<AddressBatchSnapshot<HEIGHT>>> {
Bind the hash-chain cursor to the requested batch.
get_batch_snapshot still trusts two independent caller-controlled cursors: start..end and hashchain_idx. That lets a caller build a non-batch-sized snapshot or accidentally pair batch N's addresses/proofs with batch M's hash chain, which only fails later during proof generation/verification. This helper already knows self.zkp_batch_size, so it should enforce one aligned batch here and derive or validate the hash-chain index locally.
🩹 Suggested guardrails
pub fn get_batch_snapshot<const HEIGHT: usize>(
&self,
start: usize,
end: usize,
hashchain_idx: usize,
) -> crate::Result<Option<AddressBatchSnapshot<HEIGHT>>> {
+ if start >= end {
+ return Ok(None);
+ }
+
+ let batch_len = end - start;
+ if batch_len != self.zkp_batch_size || start % self.zkp_batch_size != 0 {
+ return Err(anyhow!(
+ "invalid address batch range {}..{} for zkp_batch_size {}",
+ start,
+ end,
+ self.zkp_batch_size
+ ));
+ }
+
+ let expected_hashchain_idx = start / self.zkp_batch_size;
+ if hashchain_idx != expected_hashchain_idx {
+ return Err(anyhow!(
+ "hashchain_idx {} does not match batch range {}..{} (expected {})",
+ hashchain_idx,
+ start,
+ end,
+ expected_hashchain_idx
+ ));
+ }
+
let available = self.wait_for_batch(end);
- if available < end || start >= end {
+ if available < end {
return Ok(None);
}Based on learnings: ZKP batch sizes for address trees must be 10 or 250, and tree updates must require zero-knowledge proofs with public inputs: old root, new root, hash chain, and next_index.
Also applies to: 524-541
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@forester/src/processor/v2/helpers.rs` around lines 489 - 494,
get_batch_snapshot currently accepts independent caller-controlled cursors
(start..end and hashchain_idx); enforce batch alignment by computing the batch
index from start using self.zkp_batch_size (e.g., batch_idx = start /
self.zkp_batch_size), require end == start + self.zkp_batch_size, and derive the
hash-chain index from that batch index instead of trusting the caller-provided
hashchain_idx (or validate that the passed hashchain_idx equals the derived
value and return an error if not). Update get_batch_snapshot to validate these
invariants and use the derived hash-chain index when loading the hash chain;
apply the same change to the analogous helper at the other location referenced
in the comment.
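The derive-don't-trust rule the guardrails describe boils down to a small pure check; the function name and `String` error type are illustrative stand-ins for the forester's types:

```rust
/// Derive the hash-chain index from the batch range instead of trusting two
/// independent caller-supplied cursors: the range must be exactly one aligned
/// ZKP batch, and the hash-chain index must match the batch it covers.
fn derive_hashchain_idx(
    start: usize,
    end: usize,
    zkp_batch_size: usize,
    hashchain_idx: usize,
) -> Result<usize, String> {
    if end != start + zkp_batch_size || start % zkp_batch_size != 0 {
        return Err(format!(
            "invalid batch range {start}..{end} for zkp_batch_size {zkp_batch_size}"
        ));
    }
    let expected = start / zkp_batch_size;
    if hashchain_idx != expected {
        return Err(format!(
            "hashchain_idx {hashchain_idx} does not match batch range {start}..{end} (expected {expected})"
        ));
    }
    Ok(expected)
}
```

With this in place, pairing batch N's addresses with batch M's hash chain fails immediately at the snapshot boundary rather than later during proof verification.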
if IS_LOADING
    .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
    .is_err()
{
    if health_check(120, 1).await {
        return;
    }
    panic!("Failed to start prover, health check failed.");
}

let spawn_result = async {
    let mut command = Command::new(prover_path);
    command.arg("start-prover").stdout(Stdio::piped()).stderr(Stdio::piped());
    let mut child = command.spawn().expect("Failed to start prover process");
    let mut child_exit_status = None;

    let command = Command::new(prover_path)
        .arg("start-prover")
        .spawn()
        .expect("Failed to start prover process");
    for _ in 0..120 {
        if health_check(1, 1).await {
            info!("Prover started successfully");
            return;
        }

        let _ = command.wait_with_output();
        if child_exit_status.is_none() {
            match child.try_wait() {
                Ok(Some(status)) => {
                    tracing::warn!(
                        ?status,
                        "prover launcher exited before health check succeeded; continuing to poll for detached prover"
                    );
                    child_exit_status = Some(status);
                }
                Ok(None) => {}
                Err(error) => {
                    tracing::error!(?error, "failed to poll prover child process");
                }
            }
        }

        let health_result = health_check(120, 1).await;
        if health_result {
            info!("Prover started successfully");
        } else {
            panic!("Failed to start prover, health check failed.");
            sleep(Duration::from_secs(1)).await;
        }

        if let Some(status) = child_exit_status {
            panic!(
                "Failed to start prover, health check failed after launcher exited with status {status}."
            );
        }

        panic!("Failed to start prover, health check failed.");
}
.await;

IS_LOADING.store(false, Ordering::Release);
spawn_result
Clear IS_LOADING on every exit path.
The flag is reset only after the awaited startup block completes normally. Any panic! or expect() inside that block unwinds past Line 187 and leaves IS_LOADING stuck true, so later callers can no longer retry startup in the same process.
Possible fix
+ struct LoadingGuard;
+ impl Drop for LoadingGuard {
+ fn drop(&mut self) {
+ IS_LOADING.store(false, Ordering::Release);
+ }
+ }
+
+ let _loading_guard = LoadingGuard;
let spawn_result = async {
let mut command = Command::new(prover_path);
command.arg("start-prover").stdout(Stdio::piped()).stderr(Stdio::piped());
@@
}
.await;
-
- IS_LOADING.store(false, Ordering::Release);
 spawn_result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@prover/client/src/prover.rs` around lines 136 - 188, The IS_LOADING flag can
remain true if the awaited startup block panics or calls expect!, so ensure
IS_LOADING.store(false, Ordering::Release) runs on every exit path of the
startup sequence: wrap the async startup block currently assigned to
spawn_result in a panic::catch_unwind (or use a scope guard/defer) so you can
clear IS_LOADING in a finally-like section after awaiting the block, then
re-propagate any panic or return the block's result; reference the IS_LOADING
static, the spawn_result async block, and the code paths that call panic! /
expect! (Command::spawn, health_check loop, child.try_wait) to place the
cleanup.
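The scope-guard idea behind the suggested `LoadingGuard` can be demonstrated in isolation: a `Drop` impl runs during unwinding too, so the flag clears on every exit path, including panics. The `startup_that_fails` function is a hypothetical stand-in for the prover startup block, not the actual code:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

static IS_LOADING: AtomicBool = AtomicBool::new(false);

/// RAII guard: Drop runs whether the scope exits normally or unwinds from a
/// panic, so IS_LOADING can never be left stuck at true.
struct LoadingGuard;

impl Drop for LoadingGuard {
    fn drop(&mut self) {
        IS_LOADING.store(false, Ordering::Release);
    }
}

/// Stand-in for the startup sequence: it always panics, like a failed health
/// check, yet the guard still clears the flag during unwinding.
fn startup_that_fails() {
    IS_LOADING.store(true, Ordering::Release);
    let _guard = LoadingGuard; // bound to a name so it lives until scope end
    panic!("Failed to start prover, health check failed.");
}
```

One caveat: this relies on unwinding, so under `panic = "abort"` the guard never runs; in that configuration the whole process dies anyway, making the stuck flag moot.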
start_index: aq.start_index,
tree_next_insertion_index: aq.start_index,
root_seq: aq.root_seq,
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n -C4 'tree_next_insertion_index|queue_next_index|start_index' forester sdk-libs program-libs --glob '*.rs'
Repository: Lightprotocol/light-protocol
Length of output: 50387
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the photon_indexer.rs file and extract the struct definition
find . -name "photon_indexer.rs" -type f | head -5
Repository: Lightprotocol/light-protocol
Length of output: 119
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look at the context around lines 1693-1695
sed -n '1680,1710p' sdk-libs/client/src/indexer/photon_indexer.rs
# Also find where this struct/field is defined
echo "=== LOOKING FOR STRUCT DEFINITION ==="
rg "struct.*AddressQueue\|start_index:" sdk-libs/client/src/indexer/photon_indexer.rs -A 5 | head -40
Repository: Lightprotocol/light-protocol
Length of output: 1388
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the AddressQueue type/struct definition in the response parsing
rg "struct.*AddressQueue|type.*AddressQueue" sdk-libs/client/src/indexer/photon_indexer.rs -B 2 -A 10 | head -60
Repository: Lightprotocol/light-protocol
Length of output: 54
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look at more context around the response parsing to find the serde deserialization
sed -n '1600,1750p' sdk-libs/client/src/indexer/photon_indexer.rs | head -200
Repository: Lightprotocol/light-protocol
Length of output: 6656
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the Photon API response types to understand what start_index represents
find . -type f -name "*.rs" | xargs rg "photon_api.*AddressQueue|api_response.*address_queue" -l | head -5
# Also search for where these types come from
rg "use.*photon_api" sdk-libs/client/src/indexer/photon_indexer.rs | head -10
Repository: Lightprotocol/light-protocol
Length of output: 171
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for photon_api crate definition
find . -type f -name "photon_api*" -o -name "*photon*" | grep -E "Cargo.toml|lib.rs" | head -10
# Also check if photon_api is external
rg "photon-api|photon_api" Cargo.toml -l | head -5
Repository: Lightprotocol/light-protocol
Length of output: 124
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search for photon_api crate definition
find . -type f -name "Cargo.toml" | xargs grep -l "photon" | head -10
# Check the Photon docs to understand what start_index is
find . -type d -name "*photon*" | head -10
Repository: Lightprotocol/light-protocol
Length of output: 289
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look at the photon-api crate structure
ls -la sdk-libs/photon-api/
# Find type definitions for address queue
rg "struct.*AddressQueue|start_index" sdk-libs/photon-api/ -A 3 | head -80
Repository: Lightprotocol/light-protocol
Length of output: 5363
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the external photon directory for documentation
ls -la external/photon/ | head -30
# Look for any comments or documentation about start_index
rg "start_index" external/photon/ -B 3 -A 3 | head -100
Repository: Lightprotocol/light-protocol
Length of output: 178
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the docs mentioned in learnings
cat program-libs/batched-merkle-tree/docs/INSERT_ADDRESS_QUEUE.md

Repository: Lightprotocol/light-protocol
Length of output: 5337
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check the update docs to understand tree_next_insertion_index
cat program-libs/batched-merkle-tree/docs/UPDATE_FROM_ADDRESS_QUEUE.md | head -150

Repository: Lightprotocol/light-protocol
Length of output: 170
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find docs files
find program-libs -name "*.md" -type f | head -20
# Also check for the CLAUDE.md file
cat program-libs/batched-merkle-tree/docs/CLAUDE.md

Repository: Lightprotocol/light-protocol
Length of output: 10407
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look at how streaming_queue is used - what does tree_next_insertion_index() return?
sed -n '580,600p' forester/src/processor/v2/helpers.rs
# Check the structure of StreamingAddressQueue
rg "impl StreamingAddressQueue" forester/src/processor/v2/helpers.rs -A 50 | head -80Repository: Lightprotocol/light-protocol
Length of output: 2617
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find where streaming_queue data is populated from photon response
rg "StreamingAddressQueue\|tree_next_insertion_index" forester/src/processor/v2/helpers.rs -B 5 -A 5 | head -100Repository: Lightprotocol/light-protocol
Length of output: 54
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look at where the streaming queue wraps the photon response data
rg "StreamingAddressQueue::new|tree_next_insertion_index:" forester/src/processor/v2/ -B 3 -A 3 | head -100Repository: Lightprotocol/light-protocol
Length of output: 54
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Let me check test_indexer to see how it distinguishes start_index from tree_next_insertion_index
sed -n '720,735p' sdk-libs/program-test/src/indexer/test_indexer.rs

Repository: Lightprotocol/light-protocol
Length of output: 833
Line 1694 incorrectly assigns aq.start_index to tree_next_insertion_index—these are separate concerns.
The test indexer (lines 729–730) and forester's AddressStagingTree initialization confirm the distinction:
- start_index: where the indexed queue slice begins
- tree_next_insertion_index: the tree's actual next insertion position (the leaf index the staging tree should initialize to)
When forester calls AddressStagingTree::new(..., start_index) at line 183 of forester/src/processor/v2/strategy/address.rs, it uses this value to track which leaves have already been processed. If tree_next_insertion_index is set to the queue's slice start instead of the tree's true next leaf position, alignment checks will fail and batches will be skipped or processed from the wrong position.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@sdk-libs/client/src/indexer/photon_indexer.rs` around lines 1693 - 1695, The
assignment is wrong: do not set tree_next_insertion_index to aq.start_index; use
the queue's stored tree insertion position instead so the staging tree aligns
with the real next leaf. Update the struct initialization where start_index:
aq.start_index, tree_next_insertion_index: aq.start_index, root_seq: aq.root_seq
to set tree_next_insertion_index from the queue's actual next-insertion field
(e.g., aq.tree_next_insertion_index or the correct field that records the tree's
next leaf) so AddressStagingTree::new(..., start_index) and downstream alignment
checks use the true tree insertion index rather than the slice start.
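To make the distinction concrete, here is a hypothetical, simplified sketch; the struct and field names mirror the review's terminology, not the actual Photon API or forester types:

```rust
// Hypothetical stand-in for the indexer's address queue batch.
struct AddressQueueBatch {
    start_index: u64,               // where the indexed queue slice begins
    tree_next_insertion_index: u64, // the tree's true next leaf position
}

// The staging tree must be initialized from the tree's insertion position,
// not from the queue slice offset.
fn staging_tree_aligned(batch: &AddressQueueBatch, staging_next_leaf: u64) -> bool {
    staging_next_leaf == batch.tree_next_insertion_index
}

fn main() {
    let batch = AddressQueueBatch {
        start_index: 100,
        tree_next_insertion_index: 40,
    };
    // Correct: initialize the staging tree from tree_next_insertion_index.
    assert!(staging_tree_aligned(&batch, batch.tree_next_insertion_index));
    // The bug described above: conflating the two fields misaligns the tree.
    assert!(!staging_tree_aligned(&batch, batch.start_index));
    println!("alignment checks behave as expected");
}
```

The two indices coincide only when the queue slice happens to start exactly at the tree's next leaf, which is why the conflation can pass some tests and still fail in production.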
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sdk-libs/sdk-types/src/interface/account/token_seeds.rs (1)
268-280: ⚠️ Potential issue | 🟠 Major

Don’t silently ignore `output_queue_index`; enforce consistency or remove it end-to-end.

Line 268 makes `output_queue_index` unused, while Line 280 now sources `queue_pubkey_index` only from `tree_info`. Since callers still pass `output_queue_index` (see `sdk-libs/sdk-types/src/interface/program/decompression/token.rs`), mismatches become silent and can produce incorrect merkle context assembly.

Proposed guard to prevent silent divergence:

```diff
 fn into_in_token_data(
     &self,
     tree_info: &PackedStateTreeInfo,
-    _output_queue_index: u8,
+    output_queue_index: u8,
 ) -> Result<MultiInputTokenDataWithContext, LightSdkTypesError> {
+    if output_queue_index != tree_info.queue_pubkey_index {
+        return Err(LightSdkTypesError::InvalidInstructionData);
+    }
     Ok(MultiInputTokenDataWithContext {
         amount: self.token_data.amount,
         mint: self.token_data.mint,
         owner: self.token_data.owner,
         version: self.token_data.version,
         has_delegate: self.token_data.has_delegate,
         delegate: self.token_data.delegate,
         root_index: tree_info.root_index,
         merkle_context: PackedMerkleContext {
             merkle_tree_pubkey_index: tree_info.merkle_tree_pubkey_index,
-            queue_pubkey_index: tree_info.queue_pubkey_index,
+            queue_pubkey_index: output_queue_index,
             leaf_index: tree_info.leaf_index,
             prove_by_index: tree_info.prove_by_index,
         },
     })
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sdk-libs/sdk-types/src/interface/account/token_seeds.rs` around lines 268 - 280, The function currently accepts output_queue_index but ignores it when building PackedMerkleContext (using tree_info.queue_pubkey_index), which can cause silent mismatches; update the constructor logic in the function that returns MultiInputTokenDataWithContext to either (a) validate that output_queue_index == tree_info.queue_pubkey_index and return an Err on mismatch, or (b) use the provided output_queue_index when setting PackedMerkleContext.queue_pubkey_index (and document the precedence), and ensure callers in token.rs remain compatible; refer to symbols output_queue_index, tree_info, PackedMerkleContext, and MultiInputTokenDataWithContext to locate and fix the code path.
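A minimal sketch of option (a) from the comment above — validating the caller-supplied index instead of ignoring it. The types here are simplified stand-ins for the SDK's real structs, not the actual API:

```rust
// Simplified stand-ins for LightSdkTypesError and PackedStateTreeInfo.
#[derive(Debug, PartialEq)]
enum SdkError {
    InvalidInstructionData,
}

struct TreeInfo {
    queue_pubkey_index: u8,
}

// Reject silently divergent inputs instead of picking one side.
fn checked_queue_index(tree_info: &TreeInfo, output_queue_index: u8) -> Result<u8, SdkError> {
    if output_queue_index != tree_info.queue_pubkey_index {
        return Err(SdkError::InvalidInstructionData);
    }
    Ok(output_queue_index)
}

fn main() {
    let info = TreeInfo { queue_pubkey_index: 3 };
    assert_eq!(checked_queue_index(&info, 3), Ok(3));
    assert_eq!(checked_queue_index(&info, 7), Err(SdkError::InvalidInstructionData));
    println!("queue index guard ok");
}
```

Failing loudly at the boundary is usually preferable to option (b): a documented precedence rule still lets a stale caller run with the wrong queue index unnoticed.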
♻️ Duplicate comments (2)
prover/client/src/prover.rs (2)
42-44: ⚠️ Potential issue | 🟠 Major

Don’t treat a listening port as a healthy service.

Lines 42-44 still return healthy before an HTTP probe. A stale or unrelated listener can bypass startup and break downstream calls.

Suggested fix:

```diff
 fn health_check_once(timeout: Duration) -> bool {
-    if prover_listener_present() {
-        return true;
-    }
+    let listener_present = prover_listener_present();
@@
     let mut stream = match TcpStream::connect_timeout(&addr, timeout) {
         Ok(stream) => stream,
         Err(error) => {
-            tracing::debug!(?error, endpoint, "prover health TCP connect failed");
+            tracing::debug!(?error, endpoint, listener_present, "prover health TCP connect failed");
             return health_check_once_with_curl(timeout);
         }
     };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/prover.rs` around lines 42 - 44, The health check currently returns true if prover_listener_present() returns true, which lets any open listener mark the service healthy; remove this early return and instead perform the actual readiness probe (e.g., send an HTTP GET to the expected health/readiness endpoint or call the real service initialization check) before returning true. Update the function that contains prover_listener_present() (the health-check/ready-check function) to only consider prover_listener_present() as a preliminary hint and only return healthy after the HTTP probe (or the concrete readiness function such as check_http_probe(), perform_readiness_probe(), or the existing HTTP client call) succeeds; ensure proper error propagation/logging when the probe fails.
152-167: ⚠️ Potential issue | 🟠 Major

Reset `IS_LOADING` even when startup panics.

Line 166 is not reached on panic paths inside the startup block, so `IS_LOADING` can remain `true` and deadlock subsequent startup attempts.

Suggested fix:

```diff
+    struct LoadingGuard;
+    impl Drop for LoadingGuard {
+        fn drop(&mut self) {
+            IS_LOADING.store(false, Ordering::Release);
+        }
+    }
+    let _loading_guard = LoadingGuard;
+
     let spawn_result = async {
         Command::new(&prover_path)
             .arg("start-prover")
             .spawn()
             .unwrap_or_else(|error| panic!("Failed to start prover process: {error}"));
@@
     }
     .await;
-
-    IS_LOADING.store(false, Ordering::Release);
     spawn_result
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prover/client/src/prover.rs` around lines 152 - 167, The IS_LOADING flag can remain true if the async startup block panics, so create an RAII guard (e.g., a small LoadingGuard struct) that calls IS_LOADING.store(false, Ordering::Release) in its Drop impl, instantiate it before running the async startup block (the block that uses Command::new(&prover_path), health_check, and assigns spawn_result), and remove the explicit IS_LOADING.store(false, Ordering::Release) at the end; this ensures IS_LOADING is cleared regardless of panics during the startup sequence.
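The RAII pattern suggested above can be exercised in isolation. This is a self-contained sketch (not the forester's actual code) showing that a `Drop`-based guard clears the flag even when the guarded block panics:

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

static IS_LOADING: AtomicBool = AtomicBool::new(false);

// RAII guard: Drop runs even during panic unwinding, so a failed startup
// cannot leave IS_LOADING stuck at true.
struct LoadingGuard;

impl Drop for LoadingGuard {
    fn drop(&mut self) {
        IS_LOADING.store(false, Ordering::Release);
    }
}

fn startup_that_panics() {
    IS_LOADING.store(true, Ordering::Release);
    let _guard = LoadingGuard;
    panic!("startup failed");
}

fn main() {
    // catch_unwind stands in for a panic escaping the async startup block.
    let result = panic::catch_unwind(startup_that_panics);
    assert!(result.is_err());
    // The guard's Drop ran during unwinding: the flag is reset.
    assert!(!IS_LOADING.load(Ordering::Acquire));
    println!("IS_LOADING reset after panic");
}
```

Note this relies on unwinding; under `panic = "abort"` the process dies anyway, so the stuck-flag scenario cannot occur in that configuration.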
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@prover/client/src/prover.rs`:
- Around line 173-184: The health_check function currently calls the blocking
helper health_check_once directly inside the async runtime; change it to offload
blocking work by calling tokio::task::spawn_blocking(||
health_check_once(timeout)) and await the JoinHandle (e.g., let ok =
tokio::task::spawn_blocking(move ||
health_check_once(timeout)).await.unwrap_or(false)); use that boolean result in
the existing logic (return true if ok), ensuring health_check_once remains sync
and any blocking I/O runs inside spawn_blocking so the Tokio worker thread is
not blocked.
In `@sdk-libs/program-test/src/indexer/test_indexer.rs`:
- Around line 2577-2621: The parse/read error handling currently uses `?` inside
the success branch (specifically the
`response_result.text().await.map_err(|error| { ... })?` and
`deserialize_gnark_proof_json(&body).map_err(|error| ...) ?` calls), which
short-circuits out of the retry loop; replace those `?` usages by handling
errors locally: call `response_result.text().await` and if it Err, set
`last_error` to a descriptive message (similar to the existing error branches)
and continue the retry loop instead of returning; likewise call
`deserialize_gnark_proof_json(&body)` and on Err set `last_error` (including a
body-preview if helpful) and continue; only when both read and deserialize
succeed proceed to `proof_from_json_struct`, `compress_proof`, and return
`Ok(ValidityProofWithContext { ... })`; keep using the existing `last_error` and
`retries` logic so network/parse failures trigger retries instead of immediate
failure.
---
Outside diff comments:
In `@sdk-libs/sdk-types/src/interface/account/token_seeds.rs`:
- Around line 268-280: The function currently accepts output_queue_index but
ignores it when building PackedMerkleContext (using
tree_info.queue_pubkey_index), which can cause silent mismatches; update the
constructor logic in the function that returns MultiInputTokenDataWithContext to
either (a) validate that output_queue_index == tree_info.queue_pubkey_index and
return an Err on mismatch, or (b) use the provided output_queue_index when
setting PackedMerkleContext.queue_pubkey_index (and document the precedence),
and ensure callers in token.rs remain compatible; refer to symbols
output_queue_index, tree_info, PackedMerkleContext, and
MultiInputTokenDataWithContext to locate and fix the code path.
---
Duplicate comments:
In `@prover/client/src/prover.rs`:
- Around line 42-44: The health check currently returns true if
prover_listener_present() returns true, which lets any open listener mark the
service healthy; remove this early return and instead perform the actual
readiness probe (e.g., send an HTTP GET to the expected health/readiness
endpoint or call the real service initialization check) before returning true.
Update the function that contains prover_listener_present() (the
health-check/ready-check function) to only consider prover_listener_present() as
a preliminary hint and only return healthy after the HTTP probe (or the concrete
readiness function such as check_http_probe(), perform_readiness_probe(), or the
existing HTTP client call) succeeds; ensure proper error propagation/logging
when the probe fails.
- Around line 152-167: The IS_LOADING flag can remain true if the async startup
block panics, so create an RAII guard (e.g., a small LoadingGuard struct) that
calls IS_LOADING.store(false, Ordering::Release) in its Drop impl, instantiate
it before running the async startup block (the block that uses
Command::new(&prover_path), health_check, and assigns spawn_result), and remove
the explicit IS_LOADING.store(false, Ordering::Release) at the end; this ensures
IS_LOADING is cleared regardless of panics during the startup sequence.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 12323e3c-a7c5-473c-847a-f66e3b6762e8
⛔ Files ignored due to path filters (1)
sdk-tests/csdk-anchor-full-derived-test/tests/basic_test.rs is excluded by none and included by none
📒 Files selected for processing (6)
- prover/client/src/errors.rs
- prover/client/src/prover.rs
- sdk-libs/client/src/interface/instructions.rs
- sdk-libs/program-test/src/indexer/test_indexer.rs
- sdk-libs/sdk-types/src/interface/account/token_seeds.rs
- sdk-libs/sdk-types/src/interface/program/decompression/pda.rs
```diff
 pub async fn health_check(retries: usize, timeout: usize) -> bool {
-    let client = reqwest::Client::new();
-    let mut result = false;
-    for _ in 0..retries {
-        match client
-            .get(format!("{}{}", SERVER_ADDRESS, HEALTH_CHECK))
-            .send()
-            .await
-        {
-            Ok(_) => {
-                result = true;
-                break;
-            }
-            Err(_) => {
-                sleep(Duration::from_secs(timeout as u64));
-            }
+    let timeout = Duration::from_secs(timeout as u64);
+    let retry_delay = timeout;
+
+    for attempt in 0..retries {
+        if health_check_once(timeout) {
+            return true;
+        }
+
+        if attempt + 1 < retries {
+            sleep(retry_delay).await;
+        }
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify blocking std I/O and process calls are executed from async health-check path.
set -euo pipefail
rg -nP --type=rust '\bpub\s+async\s+fn\s+health_check\b|health_check_once\(' prover/client/src/prover.rs -C3
rg -nP --type=rust 'TcpStream::connect_timeout|\.read\(|Command::new\("lsof"\)|Command::new\("curl"\)' prover/client/src/prover.rs -C2

Repository: Lightprotocol/light-protocol
Length of output: 1319
🏁 Script executed:
# Check for existing spawn_blocking usage patterns in the codebase
rg -n 'spawn_blocking' --type=rust | head -20

Repository: Lightprotocol/light-protocol
Length of output: 637
🏁 Script executed:
# Check if CLAUDE.md exists and contains async/blocking guidelines
find . -name 'CLAUDE.md' -o -name 'claude.md' | xargs cat 2>/dev/null || echo "CLAUDE.md not found"

Repository: Lightprotocol/light-protocol
Length of output: 50386
🏁 Script executed:
# Find all callers of health_check to understand the context
rg -n '\bhealth_check\(' --type=rust -B2 -A2 | grep -v 'fn health_check'

Repository: Lightprotocol/light-protocol
Length of output: 1234
🏁 Script executed:
# Verify the exact lines and confirm the diff context
sed -n '173,190p' prover/client/src/prover.rs

Repository: Lightprotocol/light-protocol
Length of output: 436
🏁 Script executed:
# Check if health_check_once is awaitable or blocking
rg -n 'fn health_check_once' prover/client/src/prover.rs -A1

Repository: Lightprotocol/light-protocol
Length of output: 294
Move blocking health probes off the Tokio worker thread.
The health_check_once call at line 178 performs blocking socket and process I/O (TcpStream::connect_timeout, stream.read, Command::new) directly within an async function. This blocks runtime workers and can degrade other async tasks waiting on the executor.
Wrap the blocking work in tokio::task::spawn_blocking:
Suggested fix
pub async fn health_check(retries: usize, timeout: usize) -> bool {
let timeout = Duration::from_secs(timeout as u64);
let retry_delay = timeout;
for attempt in 0..retries {
- if health_check_once(timeout) {
+ let check_timeout = timeout;
+ let ok = tokio::task::spawn_blocking(move || health_check_once(check_timeout))
+ .await
+ .unwrap_or(false);
+ if ok {
return true;
}
if attempt + 1 < retries {
sleep(retry_delay).await;
}
}
+
+ false
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```rust
pub async fn health_check(retries: usize, timeout: usize) -> bool {
    let timeout = Duration::from_secs(timeout as u64);
    let retry_delay = timeout;
    for attempt in 0..retries {
        let check_timeout = timeout;
        let ok = tokio::task::spawn_blocking(move || health_check_once(check_timeout))
            .await
            .unwrap_or(false);
        if ok {
            return true;
        }
        if attempt + 1 < retries {
            sleep(retry_delay).await;
        }
    }
    false
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@prover/client/src/prover.rs` around lines 173 - 184, The health_check
function currently calls the blocking helper health_check_once directly inside
the async runtime; change it to offload blocking work by calling
tokio::task::spawn_blocking(|| health_check_once(timeout)) and await the
JoinHandle (e.g., let ok = tokio::task::spawn_blocking(move ||
health_check_once(timeout)).await.unwrap_or(false)); use that boolean result in
the existing logic (return true if ok), ensuring health_check_once remains sync
and any blocking I/O runs inside spawn_blocking so the Tokio worker thread is
not blocked.
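Independent of the spawn_blocking change, the retry loop's sleep placement is easy to get subtly wrong. The sketch below models the shape of the loop above — a delay between attempts but none after the last — with `probe(attempt)` standing in for the (now offloaded) `health_check_once` call and a counter standing in for `sleep(retry_delay).await`:

```rust
// Returns (attempts made, sleeps taken) for a given probe outcome sequence.
fn attempts_and_sleeps(retries: usize, probe: impl Fn(usize) -> bool) -> (usize, usize) {
    let mut sleeps = 0;
    for attempt in 0..retries {
        if probe(attempt) {
            return (attempt + 1, sleeps);
        }
        if attempt + 1 < retries {
            sleeps += 1; // stand-in for sleep(retry_delay).await
        }
    }
    (retries, sleeps)
}

fn main() {
    // Succeeds on the third of five attempts: two sleeps, no trailing delay.
    assert_eq!(attempts_and_sleeps(5, |a| a == 2), (3, 2));
    // Never succeeds: all four attempts run, with only three sleeps.
    assert_eq!(attempts_and_sleeps(4, |_| false), (4, 3));
    println!("retry/sleep accounting ok");
}
```

The `attempt + 1 < retries` guard is what keeps a fully failed check from wasting one extra delay before returning `false`.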
Summary by CodeRabbit
New Features
Refactor
Public API
Tests
Bug Fixes